Retrieval of Textual Song Lyrics from Sung Inputs
Abstract
Retrieving the lyrics of a sung recording from a database of text documents is a research topic that has not received attention so far. Such a retrieval system has many practical uses, for example in karaoke applications or for indexing large song databases by their lyric content. In this paper, we present such a lyrics retrieval system. In the first step, phoneme posteriorgrams are extracted from sung recordings using various acoustic models trained on TIMIT and a variation thereof, and on subsets of a large database of recordings of unaccompanied singing (DAMP). On the text side, we generate binary templates from the available textual lyrics. Since these lyrics carry no temporal information, we then employ an approach based on Dynamic Time Warping to retrieve the most likely lyrics document for each recording. The approach is tested on a different subset of the unaccompanied singing database, which includes 601 recordings of 301 different songs (12000 lines of lyrics). It is evaluated both song-wise and line-wise. The results are highly encouraging and could be used further to perform automatic lyrics alignment and keyword spotting for large databases of songs.
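The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`binary_template`, `dtw_cost`, `retrieve`), the local cost `1 - posterior`, the three allowed step directions, and the normalisation by `T + N` are all assumptions chosen for illustration. It assumes each lyrics document has already been converted to a sequence of phoneme indices and that the posteriorgram is a frames-by-phonemes matrix.

```python
import numpy as np


def binary_template(phoneme_ids, num_phonemes):
    """One row per lyric phoneme, with a 1 in that phoneme's column (assumed layout)."""
    template = np.zeros((len(phoneme_ids), num_phonemes))
    template[np.arange(len(phoneme_ids)), phoneme_ids] = 1.0
    return template


def dtw_cost(posteriorgram, template):
    """Length-normalised DTW cost between a posteriorgram (T x K) and a template (N x K)."""
    # Local cost (assumption): 1 minus the posterior of the template phoneme at each frame.
    local = 1.0 - posteriorgram @ template.T  # shape (T, N)
    num_frames, num_steps = local.shape
    acc = np.full((num_frames + 1, num_steps + 1), np.inf)
    acc[0, 0] = 0.0
    for t in range(1, num_frames + 1):
        for n in range(1, num_steps + 1):
            # Allowed moves: stay on the phoneme, skip a frame step, or advance both.
            best_prev = min(acc[t - 1, n], acc[t, n - 1], acc[t - 1, n - 1])
            acc[t, n] = local[t - 1, n - 1] + best_prev
    # Normalise so that long and short lyrics documents are comparable (assumption).
    return acc[num_frames, num_steps] / (num_frames + num_steps)


def retrieve(posteriorgram, lyrics_db, num_phonemes):
    """Return the key of the lowest-cost lyrics document and all scores."""
    scores = {
        title: dtw_cost(posteriorgram, binary_template(ids, num_phonemes))
        for title, ids in lyrics_db.items()
    }
    return min(scores, key=scores.get), scores
```

Here `lyrics_db` is a hypothetical dictionary mapping a document identifier to its phoneme index sequence. The same routine could in principle be applied to individual lines instead of whole songs to mirror the line-wise evaluation, though the abstract does not say how that was done.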
Similar resources
Rhyme and Style Features for Musical Genre Classification by Song Lyrics
How individuals perceive music is influenced by many different factors. The audible part of a piece of music, its sound, does for sure contribute, but is only one aspect to be taken into account. Cultural information influences how we experience music, as does the songs’ text and its sound. Next to symbolic and audio based music information retrieval, which focus on the sound of music, song lyr...
Multi-modal Analysis of Music: A large-scale Evaluation
Multimedia data by definition comprises several different types of content modalities. Music specifically inherits e.g. audio at its core, text in the form of lyrics, images by means of album covers, or video in the form of music videos. Yet, in many Music Information Retrieval applications, only the audio content is utilised. Recent studies have shown the usefulness of incorporating other moda...
Lyrics-Based Audio Retrieval and Multimodal Navigation in Music Collections
Modern digital music libraries contain textual, visual, and audio data describing music on various semantic levels. Exploiting the availability of different semantically interrelated representations for a piece of music, this paper presents a query-by-lyrics retrieval system that facilitates multimodal navigation in CD audio collections. In particular, we introduce an automated method to time a...
Addendum to “Multiple Lyrics Alignment: Automatic Retrieval of Song Lyrics” Technical Report
The purpose of this technical report is to discuss two additional aspects of automatic lyrics retrieval as described in “Multiple Lyrics Alignment: Automatic Retrieval of Song Lyrics” by Knees et al., 2005. The first aspect is the introduction of a confidence measure to estimate the quality of the generated output. The second aspect deals with the automatic formatting of generated lyrics to pre...
Revisiting the dissociation between singing and speaking in expressive aphasia.
We investigated the production of sung and spoken utterances in a non-fluent patient, C.C., who had a severe expressive aphasia following a right-hemisphere stroke, but whose language comprehension and memory were relatively preserved. In experiment 1, C.C. repeated familiar song excerpts under four different conditions: spoken lyrics, sung lyrics on original melody, lyrics sung on new but fami...